Fix weight loading issue #14016

ydshieh · 2021-10-15T06:41:55Z

What does this PR do?

Fix #14002 + add 2 more tests

(I uploaded the converted TF checkpoint under my own account, ydshieh. It might be better for @patrickvonplaten to upload it under patrickvonplaten instead.)

Who can review?

@Rocketknight1 @patrickvonplaten @LysandreJik

ydshieh · 2021-10-15T06:52:59Z

The issue in #14002 arises because, when using from_pt=True, the block quoted below does not use load_weight_prefix.

However, load_weight_prefix is required to extend the variable scope for the 2 components (encoder & decoder) of a TF composite model, so that a subsequent save_pretrained -> from_pretrained works after the model is created with from_encoder_decoder_pretrained.

I feel it would be better to modify load_pytorch_weights_in_tf2_model to address this situation, but I tried to avoid modifying this core Hugging Face TF method.

transformers/src/transformers/modeling_tf_utils.py

Lines 1445 to 1449 in d5b82bb

    
    if from_pt:
        from .modeling_tf_pytorch_utils import load_pytorch_checkpoint_in_tf2_model

        # Load from a PyTorch checkpoint
        return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
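
To make the naming problem concrete, here is a toy sketch in plain Python. It is not the library's loading code and calls nothing from TensorFlow or transformers. It assumes only that saved weights are matched to model variables by name, and every identifier in it (PREFIX, ENCODER_LAYERS, build_variable_names, match_by_name) is hypothetical and chosen for illustration.

```python
# Toy illustration only: no TensorFlow or transformers calls are made.
# All names here are hypothetical stand-ins, not library identifiers.

PREFIX = "composite"  # stand-in for the scope that load_weight_prefix would add
ENCODER_LAYERS = ["encoder/embeddings/weight", "encoder/layer_0/dense/kernel"]


def build_variable_names(layer_names, scope=None):
    """Return the variable names a sub-model gets when built inside `scope`."""
    prefix = f"{scope}/" if scope else ""
    return [prefix + name for name in layer_names]


def match_by_name(expected, saved):
    """Name-based loading: split `expected` into names found and not found in `saved`."""
    saved = set(saved)
    loaded = [name for name in expected if name in saved]
    missing = [name for name in expected if name not in saved]
    return loaded, missing


# Names the reloaded composite model expects for its encoder weights.
expected = build_variable_names(ENCODER_LAYERS, scope=PREFIX)

# Case 1: the component was built without the prefix (the from_pt=True path
# described above), so the names written at save time lack the scope.
saved_without_prefix = build_variable_names(ENCODER_LAYERS)
loaded_bad, missing_bad = match_by_name(expected, saved_without_prefix)

# Case 2: the component was built inside the prefix scope, so names agree.
saved_with_prefix = build_variable_names(ENCODER_LAYERS, scope=PREFIX)
loaded_ok, missing_ok = match_by_name(expected, saved_with_prefix)

print("without prefix:", len(loaded_bad), "loaded,", len(missing_bad), "missing")
print("with prefix:", len(loaded_ok), "loaded,", len(missing_ok), "missing")
```

In the first case none of the expected names are found. Under the stated assumption, this is the kind of mismatch that would break save_pretrained -> from_pretrained when the prefix is skipped.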

patrickvonplaten · 2021-10-20T07:54:05Z

Looks great to me!

ydshieh · 2021-10-28T18:07:24Z

Hi, I just realized that TFEncoderDecoderModel was released in v4.12.0 without this fix being merged.

patrickvonplaten

Merging after offline approval from @Rocketknight1

This reverts commit a67d47b.

* Start the work on TFVisionEncoderDecoderModel
* Expose TFVisionEncoderDecoderModel
* fix import
* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
* reorder
* Apply the fix for checkpoint loading as in #14016
* remove attention_mask + fix VISION_DUMMY_INPUTS
* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
* fix wrong condition: shape_list(input_ids) == 2
* add tests
* use personal TFViTModel checkpoint (for now)
* Add equivalence tests + projection layer
* style
* make sure projection layer can run
* Add examples
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean comments (need to work on TODOs for PyTorch models)
* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
* fixes
* Revert changes in PT code.
* Update tests/test_modeling_tf_vision_encoder_decoder.py
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add test_inference_coco_en for TF test
* fix quality
* fix name
* build doc
* add main_input_name
* Fix ckpt name in test
* fix diff between master and this PR
* fix doc
* fix style and quality
* fix missing doc
* fix labels handling
* Delete auto.rst
* Add the changes done in #14016
* fix prefix
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

LysandreJik requested a review from Rocketknight1 October 16, 2021 00:35

patrickvonplaten approved these changes Oct 20, 2021

Fix weight loading issue

4a5f899

ydshieh force-pushed the fix_tf_enc_dec_weight_loading branch from 412bfed to 4a5f899 Compare November 13, 2021 15:34

ydshieh mentioned this pull request Nov 14, 2021

Use cross_attention_hidden_size in Encoder-Decoder models #14378

Merged

patrickvonplaten reviewed Nov 15, 2021

patrickvonplaten merged commit a67d47b into huggingface:master Nov 15, 2021

patrickvonplaten added a commit that referenced this pull request Nov 15, 2021

Revert "Fix weight loading issue (#14016)"

dd2d5e3

This reverts commit a67d47b.

patrickvonplaten mentioned this pull request Nov 15, 2021

Revert "Fix weight loading issue" #14406

Closed

ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 26, 2021

Apply the fix for checkpoint loading as in huggingface#14016

ac0bd79

ydshieh added a commit to ydshieh/transformers that referenced this pull request Jan 6, 2022

Add the changes done in huggingface#14016

68b32f2

ydshieh deleted the fix_tf_enc_dec_weight_loading branch May 5, 2022 10:37
